Analysis of Hurricanes in the Atlantic Basin

Background

The HURDAT dataset is provided by NOAA and contains data at six-hour intervals for all hurricanes and tropical storms since 1851. This analysis covers hurricanes in the North Atlantic Ocean that affect North America and the Caribbean.

The data is provided as a text file, containing a multidimensional array for storms and individual datapoints for each time interval. The purpose of this notebook is to parse this raw data into cleanly formatted and labeled data that can be used for data exploration and analysis.

This data cleaning will include creating features for each tropical storm or hurricane as a whole (e.g. whether it made landfall, its maximum category, and the duration of the system).

Contents

  1. Parse Hurdat Dataset
    1a. Load Data
    1b. Format Into Dictionary
    1c. Feature Engineering for Individual Storms

  2. Analyze Hurricane Paths
    2a. Compare Hurricane Paths

  3. Exploratory Data Analysis of Atlantic Hurricanes 1850-2017
    3a. Number of Storms
    3b. Storm Duration Over Time
    3c. Number of Landfalls Over Time

  4. Hypothesis Analysis

1. Parse Hurdat Dataset

Data Description: https://www.nhc.noaa.gov/data/hurdat/hurdat2-format-atlantic.pdf

1a. Load Data

Hurricane data from the NOAA comes in a pretty raw format. We will first:

  1. Read in the raw data using read_hurdat()
  2. Convert the lat/lon values, which are strings with directional indicators, into numeric values using convert_lat_lon()
  3. Categorize the wind speed severity of a hurricane at any given moment using hurr_category()
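The notebook's own helper implementations are not reproduced here. As a rough sketch of what steps 2 and 3 might look like, the functions below reuse the names convert_lat_lon() and hurr_category() but make their own assumptions: north and east are treated as positive, wind speeds are taken to be in knots (the unit HURDAT2 uses), and category 0 is a stand-in for anything below hurricane strength.

```python
def convert_lat_lon(value):
    """Convert a coordinate such as '28.0N' or '94.8W' to a signed float.

    Assumed convention: north and east are positive, south and west negative.
    """
    value = value.strip()
    magnitude, hemisphere = float(value[:-1]), value[-1].upper()
    if hemisphere not in "NSEW":
        raise ValueError(f"unexpected hemisphere in {value!r}")
    return -magnitude if hemisphere in "SW" else magnitude


def hurr_category(wind_knots):
    """Return the Saffir-Simpson category (1-5) for a sustained wind in knots.

    Returns 0 for anything below hurricane strength, including the
    negative placeholder values used for missing wind speeds.
    """
    # Lower bounds, in knots, of categories 1 through 5.
    thresholds = [64, 83, 96, 113, 137]
    return sum(wind_knots >= t for t in thresholds)
```

Counting how many thresholds a wind speed meets keeps the category boundaries in one list, which makes them easy to check against the Saffir-Simpson scale.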

Let's look at what a raw record looks like (Hurricane RINA):

From above, it is evident that the data is not very structured. We will parse each record into a nicely formatted dictionary for easier analysis.

1b. Format Into Dictionary
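The parsing step could look roughly like the sketch below. It assumes the HURDAT2 layout of one header line per storm (identifier, name, row count) followed by comma-separated data rows. The function name parse_hurdat, the dictionary keys, and the sample storm EXAMPLE with its values are all made up for illustration, and the sample rows are truncated to the first eight columns.

```python
from datetime import datetime

# Made-up sample in the HURDAT2 layout (not real data; wind radii columns omitted).
SAMPLE = """\
AL992099,            EXAMPLE,      3,
20990901, 0000,  , TS, 15.0N,  60.0W,  40, 1002,
20990901, 0600,  , HU, 15.5N,  61.2W,  70,  985,
20990901, 1200, L, HU, 16.1N,  62.5W,  85,  970,
"""


def parse_hurdat(lines):
    """Group HURDAT2-style lines into one dictionary per storm."""
    storms = []
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if not fields[0]:
            continue  # skip blank lines
        if fields[0][:2].isalpha():
            # Header line: basin code + storm number + year, then the name.
            storms.append({
                "id": fields[0],
                "name": fields[1],
                "year": int(fields[0][4:8]),
                "track": [],
            })
        else:
            # Data row: attach the observation to the most recent storm.
            storms[-1]["track"].append({
                "time": datetime.strptime(fields[0] + fields[1], "%Y%m%d%H%M"),
                "landfall": fields[2] == "L",  # 'L' marks a landfall record
                "status": fields[3],
                "lat": fields[4],
                "lon": fields[5],
                "wind_knots": int(fields[6]),
                "pressure_mb": int(fields[7]),
            })
    return storms


storms = parse_hurdat(SAMPLE.splitlines())
```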

Let's see what that same record for Hurricane RINA looks like now:

Much better!

1c. Feature Engineering for Individual Storms

In order to test our hypotheses, we will need some additional data for each hurricane that isn't directly available at the moment, including:

Now, let's look at Hurricane Rina again to see what we added:

Great, the data looks ready for some analysis to test our hypotheses! Let's save this cleaned up version before proceeding.
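As an illustration of the kind of storm-level features described in this section, the sketch below derives peak wind, a landfall flag, and duration from a storm's track, then serializes the result. The dictionary layout, the key names, and the function add_storm_features are assumptions made for this example, and the storm values are invented.

```python
import json
from datetime import datetime

# Invented storm record; the key names are assumptions for this sketch.
storm = {
    "name": "EXAMPLE",
    "track": [
        {"time": datetime(2099, 9, 1, 0), "wind_knots": 40, "landfall": False},
        {"time": datetime(2099, 9, 1, 12), "wind_knots": 70, "landfall": False},
        {"time": datetime(2099, 9, 2, 0), "wind_knots": 85, "landfall": True},
    ],
}


def add_storm_features(storm):
    """Add whole-storm summary features to a storm dictionary in place."""
    track = storm["track"]
    storm["max_wind_knots"] = max(p["wind_knots"] for p in track)
    storm["made_landfall"] = any(p["landfall"] for p in track)
    elapsed = track[-1]["time"] - track[0]["time"]
    storm["duration_hours"] = elapsed.total_seconds() / 3600
    return storm


features = add_storm_features(storm)

# datetime objects are not JSON-serializable, so fall back to str().
# json.dump(features, file_handle, default=str) would write to a file instead.
serialized = json.dumps(features, default=str)
```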

2. Analyze Hurricane Paths

Using our cleaned HURDAT data, we will plot and compare the paths of hurricanes based on different time periods, to get a visual sense of how hurricanes have changed over time due to climate change.

2a. Compare Hurricane Paths

Before diving into testing our hypotheses, we want to look at some maps to visualize how hurricanes have changed over the last 100 years. We will start by looking at the periods 1918-1927 vs 2008-2017.

Our hypothesis states that we expect:

  1. More hurricanes in the 2008-2017 period
  2. A higher number of very strong storms (Cat 4 & 5)

3. Exploratory Data Analysis of Atlantic Hurricanes 1850-2017

After cleaning and parsing the HURDAT dataset on Atlantic Ocean hurricanes, we will do some initial exploratory data analysis to get a better feel for the data. The aim is to uncover interesting trends and features of the dataset through visualizations and analysis.

3a. Number of Storms

There seems to be a definite trend towards an increasing number of storms since 1851. Let's go deeper and look at the number of hurricanes in each year. A hurricane is a storm with sustained wind speeds of at least 74 mph (64 knots).

What's interesting to note is that years with 10 or more hurricanes have become more frequent. In the period 1851-1990, there were 9 years where the number of hurricanes was 10 or greater. Since 1990, there have been 6 years with 10 or more hurricanes.
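A count like this can be produced with a per-year tally. The following is a minimal sketch, assuming each storm dictionary carries year and max_wind_knots keys (hypothetical names) and that 64 knots marks hurricane strength; the records are invented and a lower threshold of 3 is used so the toy data produces a result.

```python
from collections import Counter


def hurricanes_per_year(storms):
    """Tally storms that reached hurricane strength, keyed by year."""
    return Counter(s["year"] for s in storms if s["max_wind_knots"] >= 64)


def busy_years(per_year, minimum=10):
    """Return the sorted years with at least `minimum` hurricanes."""
    return sorted(year for year, count in per_year.items() if count >= minimum)


# Invented records: three hurricanes in 2001, none in 2002, two in 2003.
storms = (
    [{"year": 2001, "max_wind_knots": 70}] * 3
    + [{"year": 2002, "max_wind_knots": 50}] * 4
    + [{"year": 2003, "max_wind_knots": 100}] * 2
)

per_year = hurricanes_per_year(storms)
active = busy_years(per_year, minimum=3)
```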

3b. Storm Duration Over Time

According to the HURDAT2 data dictionary, tropical storms and depressions only began to be recorded in the 1950s and 1960s. If we include all data from 1851, this may skew the results, as data from 1851-1950 only includes the major storms.
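One way to limit this skew is to restrict the duration analysis to years after a chosen cutoff. The sketch below averages storm duration per year from a given first year onward. The cutoff of 1970 is an arbitrary choice for the example rather than a value taken from the data dictionary, the key names are hypothetical, and the records are invented.

```python
from collections import defaultdict
from statistics import mean


def mean_duration_by_year(storms, first_year):
    """Average storm duration (hours) per year, ignoring years before first_year."""
    durations = defaultdict(list)
    for s in storms:
        if s["year"] >= first_year:
            durations[s["year"]].append(s["duration_hours"])
    return {year: mean(hours) for year, hours in sorted(durations.items())}


# Invented records; the 1900 storm is dropped by the cutoff.
storms = [
    {"year": 1900, "duration_hours": 240},
    {"year": 1970, "duration_hours": 96},
    {"year": 1970, "duration_hours": 144},
    {"year": 1971, "duration_hours": 72},
]

result = mean_duration_by_year(storms, first_year=1970)
```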

3c. Number of Landfalls Over Time

4. Hypothesis Analysis

Hypothesis 1

Storms are making more frequent landfalls due to increased strength and duration of systems.

We will focus our analysis on the following metrics to understand hurricane activity across different periods of time.

Analyze Historical Averages for 10-year periods

In the 100-year period between 1918-2017, we will analyze the key statistics of each 10-year sub-period (e.g. 1918-1927, 1928-1937, etc).
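The sub-periods can be generated rather than typed out, which also makes it easy to switch to other interval lengths later. This is a small sketch with hypothetical function names; the storm years in the example are invented.

```python
def sub_periods(first_year, last_year, length):
    """Split an inclusive year range into consecutive (start, end) sub-periods."""
    return [
        (start, min(start + length - 1, last_year))
        for start in range(first_year, last_year + 1, length)
    ]


def storms_per_period(years, periods):
    """Count how many storm years fall inside each (start, end) period."""
    return {
        (start, end): sum(start <= y <= end for y in years)
        for start, end in periods
    }


periods = sub_periods(1918, 2017, 10)

# Invented storm years, one entry per storm.
counts = storms_per_period([1918, 1927, 1928, 2017], periods)
```

Calling sub_periods(1918, 2017, 5) yields the 5-year variant with the same code.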

Obtain history

Visualize

We will start by visualizing absolute values (number of hurricanes, storms, etc).

Now, let's visualize the percentage-adjusted metrics (percentage of storms as hurricanes, landfall, etc.) to get a better sense of whether there are more powerful storms now compared to historically.
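The percentage-adjusted metrics divide each count by the number of storms in the period, so periods with more recorded storms do not dominate the comparison. A minimal sketch, assuming hypothetical max_wind_knots and made_landfall keys on each storm and invented records:

```python
def percentage_metrics(period_storms):
    """Express hurricane and landfall counts as percentages of all storms."""
    total = len(period_storms)
    if total == 0:
        # No storms in the period, so the percentages are undefined.
        return {"pct_hurricanes": None, "pct_landfall": None}
    hurricanes = sum(s["max_wind_knots"] >= 64 for s in period_storms)
    landfalls = sum(s["made_landfall"] for s in period_storms)
    return {
        "pct_hurricanes": 100 * hurricanes / total,
        "pct_landfall": 100 * landfalls / total,
    }


# Invented records for a single period.
period_storms = [
    {"max_wind_knots": 45, "made_landfall": False},
    {"max_wind_knots": 70, "made_landfall": True},
    {"max_wind_knots": 120, "made_landfall": False},
    {"max_wind_knots": 50, "made_landfall": False},
]

metrics = percentage_metrics(period_storms)
```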

Analyze Historical Averages for 5-year periods

It would be interesting to see if more granular intervals can also capture the general trend of climate change, or whether this range may just be too small to notice anything meaningful due to the variability of hurricanes. This would be 5-year sub-periods (e.g. 1918-1922, 1923-1927, etc).

Visualize

We will start by visualizing absolute values (number of hurricanes, storms, etc).

Now, let's visualize the percentage-adjusted metrics (percentage of storms as hurricanes, landfall, etc.) to get a better sense of whether there are more powerful storms now compared to historically.

Look at Average duration of storms

2-year periods

Visualize

We will start by visualizing absolute values (number of hurricanes, storms, etc).

Hypothesis 2 (TBD)

Storms have increased in strength (wind speed) and intensity (minimum pressure) over time due to rising sea temperatures that have been observed (approximately 0.13 degrees Celsius per decade over the last 100 years).
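Since this hypothesis is still to be worked out, the following is only one possible starting point: averaging each storm's peak wind and minimum pressure by decade. The key names are hypothetical, the records are invented, and non-positive pressures are treated as missing on the assumption that the raw data uses a negative placeholder for unrecorded values.

```python
from collections import defaultdict
from statistics import mean


def decade_means(storms):
    """Average peak wind and minimum pressure of storms, grouped by decade."""
    by_decade = defaultdict(lambda: {"wind": [], "pressure": []})
    for s in storms:
        decade = s["year"] // 10 * 10
        by_decade[decade]["wind"].append(s["max_wind_knots"])
        if s["min_pressure_mb"] > 0:  # skip placeholder values for missing data
            by_decade[decade]["pressure"].append(s["min_pressure_mb"])
    return {
        decade: {
            "mean_max_wind": mean(values["wind"]),
            "mean_min_pressure": (
                mean(values["pressure"]) if values["pressure"] else None
            ),
        }
        for decade, values in sorted(by_decade.items())
    }


# Invented records; the first storm has no recorded pressure.
storms = [
    {"year": 1921, "max_wind_knots": 60, "min_pressure_mb": -999},
    {"year": 1925, "max_wind_knots": 80, "min_pressure_mb": 980},
    {"year": 2011, "max_wind_knots": 100, "min_pressure_mb": 950},
]

trend = decade_means(storms)
```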

TODO: